pkg/cvo/metrics: Add a cluster_version_capability metric #755
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: wking
The cluster-version operator is growing this new metric [1], and Ben
Parees wants it included in Telemetry so we collect it for clusters
that have Telemetry enabled, even if Insights is disabled [2].
Cardinality is expected to be very small, with a single label ('name'
[1]) and a restricted value set (only three registered capabilities
expected for 4.11).
[1]: openshift/cluster-version-operator#755
[2]: openshift/enhancements#922 (comment)
7bb68e3 to 60ee931
Fleet capability choices should make their way up to Red Hat via Insights uploads. But some clusters have Telemetry enabled but Insights disabled [1]. For these clusters, we'll want this data in Telemetry, and getting it into metrics (this commit) is the first step. After this lands, we'll need to update the monitoring operator like [2] to get these shipped up off of cluster.
[1]: openshift/enhancements#922 (comment)
[2]: openshift/cluster-monitoring-operator#1477
60ee931 to dabc1a6
|
PromeCIeus for the e2e-operator run has: So hurray :) |
|
/lgtm |
|
/test e2e-agnostic-upgrade |
|
Azure install failure is already tracked in an installer bug, and I'm confident that there's no update-specific impact here. /override ci/prow/e2e-agnostic-upgrade |
|
@wking: Overrode contexts on behalf of wking: ci/prow/e2e-agnostic-upgrade |
|
@wking: all tests passed! |
|
Manually labeling per Jack's earlier comment, because Prow and GitHub are fighting. |
		ch <- g
		enabledCapabilities[capability] = struct{}{}
	}
	for _, capability := range cv.Status.Capabilities.KnownCapabilities {
I am not clear on why we need to send known capabilities.
Two use cases:
- Conditional updates using local PromQL can distinguish between "yeah, that cap is not enabled here" (value 0) and "err, something's broken with scraping this metric" (no match).
- Telemetry can gauge the install-fraction for a capability (`countOfClustersWithCapEnabled / countOfClustersWithCapKnown`) with `count by (name) (cluster_version_capability == 1) / count by (name) (cluster_version_capability)`. If `cluster_version_capability` did not include known-but-not-enabled caps, you'd need some added filtering to get at `countOfClustersWithCapKnown`.
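To make the two use cases concrete, here is a minimal, stdlib-only Go sketch. It is not the operator's collector code. The helper names `capabilityValues` and `installFraction` are hypothetical, the capability names are only illustrative, and the sketch assumes every enabled capability is also listed as known. It models the per-cluster samples (1 for enabled, 0 for known-but-not-enabled) and then repeats the PromQL ratio above over a small in-memory fleet.

```go
package main

import (
	"fmt"
	"sort"
)

// capabilityValues is a hypothetical helper modelling the samples a single
// cluster would expose: one per known capability, 1 if it is also enabled,
// 0 if it is known but not enabled.
// Assumption: every enabled capability also appears in the known list.
func capabilityValues(enabled, known []string) map[string]float64 {
	enabledSet := make(map[string]struct{}, len(enabled))
	for _, c := range enabled {
		enabledSet[c] = struct{}{}
	}
	values := make(map[string]float64, len(known))
	for _, c := range known {
		if _, ok := enabledSet[c]; ok {
			values[c] = 1
		} else {
			values[c] = 0
		}
	}
	return values
}

// installFraction mimics
//   count by (name) (m == 1) / count by (name) (m)
// over a fleet of per-cluster samples. As with the PromQL division, a
// capability that no cluster has enabled has no left-hand series and so
// produces no result at all (rather than 0).
func installFraction(fleet []map[string]float64) map[string]float64 {
	enabledCount := map[string]int{}
	knownCount := map[string]int{}
	for _, cluster := range fleet {
		for name, v := range cluster {
			knownCount[name]++
			if v == 1 {
				enabledCount[name]++
			}
		}
	}
	fractions := make(map[string]float64, len(enabledCount))
	for name, n := range enabledCount {
		fractions[name] = float64(n) / float64(knownCount[name])
	}
	return fractions
}

func main() {
	// Illustrative capability names only.
	known := []string{"baremetal", "marketplace", "openshift-samples"}
	fleet := []map[string]float64{
		capabilityValues([]string{"baremetal", "marketplace"}, known),
		capabilityValues([]string{"baremetal"}, known),
	}

	fractions := installFraction(fleet)
	names := make([]string, 0, len(fractions))
	for name := range fractions {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Printf("%s %.2f\n", name, fractions[name])
	}
}
```

Because the known-but-not-enabled capabilities are exported with value 0, the denominator falls out of a plain count over all samples; without them, the known count would have to come from somewhere else, which is the "added filtering" mentioned above.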